Elizabeth Bekele, Alison Cheek
2022-05-03
#This will allow us to filter through our data
library(tidyverse)
library(dplyr)
#This will help us plot figures to showcase our findings
library(ggplot2)
#This will help us organize and display our data as necessary
library(knitr)
library(kableExtra)
#This expands our plot uses
library(plotly)
#Scientific Notation Disabled
options(scipen=999)
Import the deaths-due-to-air-pollution data
## Rows: 6,468
## Columns: 7
## $ Entity <chr> "Afghanistan", "Afghan…
## $ Code <chr> "AFG", "AFG", "AFG", "…
## $ Year <int> 1990, 1991, 1992, 1993…
## $ Air.pollution..total...deaths.per.100.000. <dbl> 299.4773, 291.2780, 27…
## $ Indoor.air.pollution..deaths.per.100.000. <dbl> 250.3629, 242.5751, 23…
## $ Outdoor.particulate.matter..deaths.per.100.000. <dbl> 46.44659, 46.03384, 44…
## $ Outdoor.ozone.pollution..deaths.per.100.000. <dbl> 5.616442, 5.603960, 5.…
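The import call itself is not shown above. A minimal sketch of how it might look follows, assuming base `read.csv()`; the dotted column names in the output are consistent with that function's default header cleaning. The helper name `import_deaths` and the file name are hypothetical.

```r
# Hypothetical helper; the report does not show its own import call.
import_deaths <- function(path) {
  # read.csv() passes headers through make.names() (check.names = TRUE by
  # default), so spaces, parentheses, and commas in a header become dots.
  read.csv(path, stringsAsFactors = FALSE)
}

# Hypothetical file name; adjust to wherever the Kaggle CSV was saved.
csv_path <- "death-due-to-air-pollution-1990-2017.csv"

# Only attempt the import when the file is actually present.
if (file.exists(csv_path)) {
  deaths_df <- import_deaths(csv_path)
  dplyr::glimpse(deaths_df)
}
```

If the file is missing, the sketch does nothing rather than stopping with an error.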
We are going to rename the columns to shorter names and glimpse the data again.
colnames(deaths_df) <- c("country", "acronym", "year", "total_deaths", "indoor_deaths", "outdoor_deaths", "ozone_deaths")
glimpse(deaths_df)
## Rows: 6,468
## Columns: 7
## $ country <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanist…
## $ acronym <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",…
## $ year <int> 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1…
## $ total_deaths <dbl> 299.4773, 291.2780, 278.9631, 278.7908, 287.1629, 288.0…
## $ indoor_deaths <dbl> 250.3629, 242.5751, 232.0439, 231.6481, 238.8372, 239.9…
## $ outdoor_deaths <dbl> 46.44659, 46.03384, 44.24377, 44.44015, 45.59433, 45.36…
## $ ozone_deaths <dbl> 5.616442, 5.603960, 5.611822, 5.655266, 5.718922, 5.739…
Variables that interest us here include:
Now, let’s take a look at the population data.
## Rows: 12,595
## Columns: 3
## $ Country.Name <chr> "Aruba", "Afghanistan", "Angola", "Albania", "Andorra", "…
## $ Year <int> 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 196…
## $ Count <int> 54211, 8996973, 5454933, 1608800, 13411, 92418, 20481779,…
To get a general idea of the deaths data frame we made, let’s make a plot to see what’s happening. This is a plot of indoor versus outdoor deaths around the world, by country.
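The plotting code is not included in the report. One possible reading of "indoor x outdoor" is a scatter plot, sketched here with `ggplot2` on three rows copied from the `glimpse()` output above (Afghanistan, 1990-1992). The object names and axis labels are assumptions.

```r
library(ggplot2)

# Three rows taken from the glimpse() output above (Afghanistan, 1990-1992).
deaths_sample <- data.frame(
  country = "Afghanistan",
  year = c(1990L, 1991L, 1992L),
  indoor_deaths = c(250.3629, 242.5751, 232.0439),
  outdoor_deaths = c(46.44659, 46.03384, 44.24377)
)

# One point per country-year, coloured by country.
indoor_outdoor_plot <- ggplot(
  deaths_sample,
  aes(x = indoor_deaths, y = outdoor_deaths, colour = country)
) +
  geom_point() +
  labs(
    x = "Indoor deaths per 100,000",
    y = "Outdoor deaths per 100,000",
    colour = "Country"
  )
```

With the full data frame, colouring by every country gives a very long legend, which may be part of why the plot is hard to read.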
This plot is too cluttered to read, so we chose two countries from each continent (one high-population and one low-population country) to graph.
We selected a high-population country from each continent and used the formula below to determine the target population for its low-population counterpart.
Low population = high population * 0.10
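The formula can be sketched in base R. The population counts are the 1960 values visible in the population output above. Treating Afghanistan as the "high" country, ignoring continents, and picking the candidate closest to the target are all assumptions made only for illustration.

```r
# Population counts are the 1960 values from the glimpse() output above.
population_df <- data.frame(
  country = c("Aruba", "Afghanistan", "Angola", "Albania", "Andorra"),
  count = c(54211, 8996973, 5454933, 1608800, 13411)
)

# Illustrative choice of "high" country; not the authors' selection.
high_country <- "Afghanistan"
high_population <- population_df$count[population_df$country == high_country]

# Low population = high population * 0.10
low_target <- high_population * 0.10

# Assumption: take the remaining country whose population is closest to the target.
candidates <- population_df[population_df$country != high_country, ]
low_country <- candidates$country[which.min(abs(candidates$count - low_target))]
```

Here `low_target` is 899697.3, and the closest candidate in this small sample is Albania.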
Which country has the highest average death count?
Let’s make a table showing the high- and low-population countries and their respective death counts due to pollution.
Here’s a graph to visualize the previous table.
So we’ve looked at the deaths due to pollution, but what percentage of the population was affected?
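Because the death columns are rates per 100,000 people (see the original column names), converting them to a percentage or to an estimated count is simple arithmetic. A minimal sketch follows; both function names are hypothetical.

```r
# Convert a rate expressed per 100,000 people into a percentage of the population.
rate_to_percent <- function(rate_per_100k) {
  rate_per_100k / 100000 * 100
}

# Estimate a death count from a rate per 100,000 and a population size.
rate_to_count <- function(rate_per_100k, population) {
  rate_per_100k / 100000 * population
}

# Afghanistan, 1990: total_deaths value from the glimpse() output above.
afg_percent <- rate_to_percent(299.4773)

# A population of 1,000,000 is a made-up round number, for illustration only.
example_count <- rate_to_count(299.4773, 1000000)
```

A rate of 299.4773 per 100,000 corresponds to roughly 0.3% of the population.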
Which type of pollution has the greatest number of deaths?
| country | avg_indoor | avg_outdoor | avg_ozone |
|---|---|---|---|
| Pakistan | 87.7427944 | 50.52063 | 10.440656 |
| Nigeria | 75.8755074 | 35.21678 | 2.117076 |
| Brazil | 19.4258385 | 26.84194 | 2.740342 |
| Germany | 0.7170881 | 25.47078 | 2.343892 |
| Australia | 0.2485867 | 17.20789 | 0.360452 |
| United States | 0.1656402 | 22.79947 | 3.915093 |

| country | avg_indoor | avg_outdoor | avg_ozone |
|---|---|---|---|
| Canada | 0.0651156 | 16.38423 | 1.9697041 |
| Chile | 8.6932699 | 27.17442 | 0.8504919 |
| Malawi | 132.1891749 | 13.81151 | 3.3870514 |
| New Zealand | 0.2908622 | 15.56872 | 0.0727512 |
| Serbia | 35.8762796 | 42.71254 | 2.9395671 |
| Sri Lanka | 44.5428441 | 24.77233 | 0.4304406 |
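Tables with this shape could be produced by grouping on country and averaging each pollutant column. The sketch below uses `dplyr` on three rows from the `glimpse()` output (Afghanistan, 1990-1992). It is an assumption about the approach, not the authors' code.

```r
library(dplyr)

# Three rows taken from the glimpse() output above (Afghanistan, 1990-1992).
deaths_sample <- data.frame(
  country = "Afghanistan",
  year = 1990:1992,
  indoor_deaths = c(250.3629, 242.5751, 232.0439),
  outdoor_deaths = c(46.44659, 46.03384, 44.24377),
  ozone_deaths = c(5.616442, 5.603960, 5.611822)
)

# One row per country with the mean of each pollutant column,
# sorted so the highest indoor average comes first.
avg_by_type <- deaths_sample %>%
  group_by(country) %>%
  summarise(
    avg_indoor = mean(indoor_deaths),
    avg_outdoor = mean(outdoor_deaths),
    avg_ozone = mean(ozone_deaths)
  ) %>%
  arrange(desc(avg_indoor))
```

To limit the table to selected countries, a `filter(country %in% ...)` step could be added before `group_by()`.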
Let’s look at the previous two decades and compare the death counts. Has there been a change?
This is the first decade, 1996-2006.
Let’s graph the previous tables!
The first decade 1996-2006.
This shows the second decade 2007-2017.
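The two periods can be compared by labelling each year with its period and averaging within it. This is a sketch under stated assumptions: the country name and all values are invented toy data, chosen only so that each period has rows.

```r
library(dplyr)

# Invented toy values; "Exampleland" is not a country in the data set.
deaths_toy <- data.frame(
  country = "Exampleland",
  year = c(1996L, 2006L, 2007L, 2017L),
  total_deaths = c(100, 80, 70, 50)
)

# Label each year with its period, then average the rate within each period.
by_period <- deaths_toy %>%
  filter(year >= 1996, year <= 2017) %>%
  mutate(period = if_else(year <= 2006, "1996-2006", "2007-2017")) %>%
  group_by(country, period) %>%
  summarise(avg_total = mean(total_deaths), .groups = "drop")
```

A lower `avg_total` in the second period than in the first would indicate that the death rate fell between the two.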
By comparing each pollutant type, we can determine which year and country had the highest numbers of deaths
Indoor Deaths
Outdoor Deaths
Ozone Deaths
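Finding the year and country with the highest value for one pollutant amounts to selecting the row with the maximum of that column. The sketch uses `dplyr::slice_max()` on four rows from the `glimpse()` output (Afghanistan, 1990-1993). The object names are assumptions.

```r
library(dplyr)

# Four rows taken from the glimpse() output above (Afghanistan, 1990-1993).
deaths_sample <- data.frame(
  country = "Afghanistan",
  year = 1990:1993,
  indoor_deaths = c(250.3629, 242.5751, 232.0439, 231.6481),
  outdoor_deaths = c(46.44659, 46.03384, 44.24377, 44.44015),
  ozone_deaths = c(5.616442, 5.603960, 5.611822, 5.655266)
)

# Row with the highest indoor rate.
top_indoor <- deaths_sample %>%
  slice_max(indoor_deaths, n = 1) %>%
  select(country, year, indoor_deaths)

# Row with the highest ozone rate.
top_ozone <- deaths_sample %>%
  slice_max(ozone_deaths, n = 1) %>%
  select(country, year, ozone_deaths)
```

In this small sample the indoor maximum falls in 1990 and the ozone maximum in 1993.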
Which is worse: outdoor or indoor pollution?
Let’s reintroduce a graph we looked at earlier. This time, we will combine the pollutant types in a single plot.
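One way to put the pollutant types on a single plot is to reshape the data to long format and map pollutant type to colour. The sketch below assumes `tidyr::pivot_longer()` and a line plot, and uses three rows from the `glimpse()` output. The authors' actual chart type is not shown.

```r
library(tidyr)
library(ggplot2)

# Three rows taken from the glimpse() output above (Afghanistan, 1990-1992).
deaths_sample <- data.frame(
  country = "Afghanistan",
  year = 1990:1992,
  indoor_deaths = c(250.3629, 242.5751, 232.0439),
  outdoor_deaths = c(46.44659, 46.03384, 44.24377),
  ozone_deaths = c(5.616442, 5.603960, 5.611822)
)

# Reshape so each row holds one country-year-pollutant combination.
deaths_long <- pivot_longer(
  deaths_sample,
  cols = c(indoor_deaths, outdoor_deaths, ozone_deaths),
  names_to = "pollutant",
  values_to = "deaths_per_100k"
)

# One line per pollutant type, one panel per country.
combined_plot <- ggplot(
  deaths_long,
  aes(x = year, y = deaths_per_100k, colour = pollutant)
) +
  geom_line() +
  facet_wrap(~ country)
```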
We cannot conclude which is worse.
[https://www.kaggle.com/datasets/akshat0giri/death-due-to-air-pollution-19902017]
[https://www.epa.gov/ground-level-ozone-pollution/ground-level-ozone-basics]
[https://www.health.nsw.gov.au/environment/air/Pages/outdoor-air-pollution.aspx]
[https://www.kaggle.com/datasets/imdevskp/world-population-19602018]